Sequencing and Raw Sequence Data Quality Control    ◾    39

do not meet the criteria entirely, this program cuts only the bases whose quality scores are
below the specified threshold from the ends of the reads.

fastq_quality_trimmer \
    -i bad_filt.fastq \
    -t 28 \
    -o bad_filt_trim.fastq \
    -Q33
fastqc bad_filt_trim.fastq
htmlfiles=$(ls *.html)
firefox $htmlfiles
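To make the effect of the threshold concrete, the following is a minimal sketch of end trimming on a single read. It is not the implementation used by fastq_quality_trimmer. It assumes Phred+33 encoded qualities (the encoding that “-Q33” selects), trims only from the 3' end for simplicity, and uses a hypothetical function name, trim_3prime. Under Phred+33, a score of 28 is written as the character “=”.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of FASTX-Toolkit): trim bases from the 3' end
# of one read while their quality score is below the threshold.
# Arguments: sequence, quality string, threshold, ASCII offset (default 33).
trim_3prime() {
    local seq=$1 qual=$2 threshold=$3 offset=${4:-33}
    local n=${#qual} code
    while (( n > 0 )); do
        # ASCII code of the last remaining quality character
        printf -v code '%d' "'${qual:n-1:1}"
        # Stop at the first base (from the end) that meets the threshold
        (( code - offset >= threshold )) && break
        (( n-- ))
    done
    printf '%s\n%s\n' "${seq:0:n}" "${qual:0:n}"
}

# Example read: the qualities decode to 40 40 40 40 40 28 20 2,
# so the last two bases fall below 28 and are removed.
trim_3prime ACGTACGT 'IIIII=5#' 28
# prints:
# ACGTAC
# IIIII=
```

The trimmed read is two bases shorter than the original, which is why a trimmed file generally contains reads of different lengths.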

The “-t” option specifies the quality threshold: bases with quality scores below this value
are trimmed from the ends of the reads. After trimming, the resulting reads may be of
unequal lengths, which some programs used in subsequent analysis steps may not accept.
As shown in Figure 1.33, trimming improved the per base sequence quality, but it also
raised a sequence length distribution warning because the trimmed reads are no longer of
equal length. We may therefore need to filter reads by length.

FIGURE 1.33  The QC report of the filtered and trimmed “bad.fastq” file.
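As an illustration of length filtering, the sketch below uses plain awk rather than any particular QC tool. The function names length_distribution and filter_by_length, the minimum length of 50, and the output file name are assumptions made for the example. It also assumes that each FASTQ record occupies exactly four lines (no wrapped sequences).

```shell
#!/usr/bin/env bash

# Hypothetical helper: tabulate read lengths as "length count" pairs,
# sorted by length. Reads FASTQ from standard input.
length_distribution() {
    awk 'NR % 4 == 2 { count[length($0)]++ }
         END { for (len in count) print len, count[len] }' | sort -n
}

# Hypothetical helper: keep only records whose sequence has at least
# min_len bases. Reads FASTQ from standard input, writes to standard output.
filter_by_length() {
    local min_len=$1
    awk -v min="$min_len" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 { if (length(seq) >= min)
                          print header "\n" seq "\n" plus "\n" $0 }
    '
}

# Possible usage (file names and the cutoff of 50 are illustrative only):
#   length_distribution < bad_filt_trim.fastq
#   filter_by_length 50 < bad_filt_trim.fastq > bad_filt_trim_len.fastq
```

Checking the length distribution first helps in choosing a cutoff that removes very short reads without discarding most of the data.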